Handling Conjunctions in Named Entities

Authors

  • Robert Dale
  • Pawel P. Mazur
Abstract

Named entity recognition consists of identifying ‘mentions’ (strings in a text that correspond to named entities) and then classifying each such mention as corresponding to a specific type of named entity, with typical categories being Company, Person and Location. The full range of named entity categories to be identified is usually application dependent. Introduced for the first time as a separately evaluated task at the Sixth Message Understanding Conference in 1995 (see, for example, [Grishman and Sundheim, 1995; 1996]), named entity recognition has attracted a considerable amount of research effort. Initially handled with hand-crafted rules (as, for example, in many of the participating systems in MUC-6 and MUC-7) and later by means of statistical approaches (see, for example, [Sang, 2002; Sang and Meulder, 2003]), the state of the art provides high performance for named entity identification and classification, both for specific domains and for language- and domain-independent systems. However, our experience with existing software tells us that there are still some categories of named entities that remain problematic. In particular, very little work has explored the problem of the potential ambiguity of conjunctions appearing in named entity strings. Consider, for example, a string like the following:

Similar Resources

Disambiguating Conjunctions in Named Entities

The recognition of named entities is now a well-developed area, with a range of symbolic and machine learning techniques that deliver high accuracy identification and categorisation of a variety of entity types. However, there are still some named entity phenomena that present problems for existing techniques; in particular, relatively little work has explored the disambiguation of conjunctions ...

Named Entity Extraction with Conjunction Disambiguation

The recognition of named entities is now a well-developed area, with a range of symbolic and machine learning techniques that deliver high accuracy extraction and categorisation of a variety of entity types. However, there are still some named entity phenomena that present problems for existing techniques; in particular, relatively little work has explored the disambiguation of conjunctions app...

PAYMA: A Tagged Corpus of Persian Named Entities

The goal in the named entity recognition task is to classify proper nouns of a piece of text into classes such as person, location, and organization. Named entity recognition is an important preprocessing step in many natural language processing tasks such as question-answering and summarization. Although many research studies have been conducted in this area in English and the state-of-the-art...

Recognizing Names by the Meaning of Defining Syntactic Structures

We present a named entity recognition system that relies on interpreting the meaning of the syntactic structures that define named entities. The syntactic heads of the defining structures are learned from the training data and are later used to interpret the meaning of the syntactic structures in the test data. Semantic features are extracted based on the interpretations and are used to build a...

A Supervised Machine Learning Approach to Conjunction Disambiguation in Named Entities

Although the literature contains reports of very high accuracy figures for the recognition of named entities in text, there are still some named entity phenomena that remain problematic for existing text processing systems. One of these is the ambiguity of conjunctions in candidate named entity strings, an all-too-prevalent problem in corporate and legal documents. In this paper, we distinguish...

Publication date: 2007